ClickHouse
Querying Overview
ClickHouse uses structured query language (SQL) for managing and querying data. ClickHouse supports a wide variety of data types including JSON and the cloud version support vectors. A general reference for querying can be found at
https://clickhouse.com/docs/sql-reference
Information on the many SQL functions is at https://clickhouse.com/docs/sql-reference/functions
Datatypes
ClickHouse supports a variety of data types. Many of these map into reasonable JavaScript data types. Details on the data types can be found at https://clickhouse.com/docs/sql-reference/data-types
Date and Timestamp Handling
ClickHouse does not automatically convert database date/time types to JavaScript Date objects. Qarbine automatically converts date oriented values into JavaScript Date objects. To turn this automatic conversion off for a query add the following line to the query specification.
#pragma noDateConversion
When “on” the ClickHouse streaming feature is not available, in which case if there is no LIMIT clause then the complete answer set is retrieved prior to applying any soft maximum number of answer set rows.
Handling JSON Data
ClickHouse supports a JSON datatype that when retrieved by Qarbine is a genuine JavaScript object (versus a string). Legacy SQL centric tools have to handle the string format which typically results in a large burden on whomever or whatever is analyzing the data. With Qarbine the data is in its easy to interact with natural form- no coding necessary to handle dynamic JSON data structures.
Information on functions to interact with JSON data is at https://clickhouse.com/docs/sql-reference/functions/json-functions
Vector Queries
Vector searches can be used to find “similar” content based on an embedding array. The array provides semantic context necessary to perform the search.The notion of “distance” is used to determine similarity. There are a variety of distance functions available as discussed at https://clickhouse.com/docs/sql-reference/functions/distance-functions
A sample query specification is shown below. It references a Qarbine variable with the embedding array value.
SELECT title, plot, year, director,
L2Distance(embedding, [! @embedding!] ) AS score
FROM movie_plots
ORDER BY score ASC
LIMIT 10
Another technique for the distance criteria is shown below
L2Distance(embedding, [! embedding(@searchPhrase, "myAiAssistant") !] ) AS score
It dynamically obtains the embedding array for the text contained in the searchPhrase variable using a configured Qarbine AI Assistant.
For more information see https://clickhouse.com/docs/knowledgebase/vector-search
Qarbine Virtual Queries
There are a few convenience queries which are mainly DBA oriented. These queries are recognized by the Qarbine driver and provide common database information. Any catalog and schema set in the data service definition constrains what is returned. For example, if a catalog is given in the data service, then only schemas in that one catalog are returned.
These virtual query defaults are independent of whatever drop down option is chosen in the Data Source Designer tool. If a specific schema’s information is wanted for example, it must be explicitly given.
| Query | Description |
|---|---|
| list databases | Return a list of visible databases. |
| list tables [DATABASE] | Return a list of tables. The optional argument may be a database name. |
| describe tables [DATABASE] | Provide details on all of the tables. The optional argument may be of the form “schema” or “catalog.schema”. This may take a while depending on your database structure. |
| describe table TABLE | Provide details on the given table. |
See the “DBA Productivity” section of the online documentation for more details.
Troubleshooting
If errors occur while using Qarbine then a good course of action is to run the query using the ClickHouse online SQL console. After logging into the console click
Information on using the ClickHouse SQL console is at
https://clickhouse.com/docs/cloud/get-started/sql-console